[fix](be) Fix variant inverted-index cast pushdown for int and boolean#63118
[fix](be) Fix variant inverted-index cast pushdown for int and boolean#63118wuguowei1994 wants to merge 1 commit into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
e75111a to
904d4c0
Compare
|
run buildall |
|
/review |
eldenmoon
left a comment
There was a problem hiding this comment.
I found a correctness blocker in the relaxed variant predicate compatibility check. The current regression covers only the same-width CAST(... AS INT) case, but this change also enables cross-width integer casts and same-family string casts without normalizing the predicate value to the segment storage encoding.
TPC-H: Total hot run time: 29643 ms |
TPC-DS: Total hot run time: 170611 ms |
d4d86b8 to
6c2b533
Compare
There was a problem hiding this comment.
Thanks for updating the patch to avoid unsafe cross-width index encoding. I still think this needs changes because the latest exact-type rule no longer fixes the original CAST(v["int_key"] AS INT) scenario described in the PR body. Critical checkpoints: goal/test coverage is not satisfied because only same-width TINYINT pushdown is now proven while the documented INT reproduction remains non-pushdown; the code change is small and focused; no new concurrency, lifecycle, config, persistence, FE-BE protocol, or storage-format compatibility concerns were introduced; the main correctness risk is now an incomplete fix rather than wrong-result pushdown; observability is unchanged and adequate for this path through the existing debug/profile checks. User focus: no additional user-provided focus was specified.
|
currently only bigint in interger types will be infered |
6c2b533 to
3096cdc
Compare
3096cdc to
d397fa1
Compare
TPC-H: Total hot run time: 31829 ms |
TPC-DS: Total hot run time: 169281 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
37a0544 to
571e6dd
Compare
TPC-H: Total hot run time: 31058 ms |
TPC-DS: Total hot run time: 169495 ms |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
### What problem does this PR solve? Related PR: apache#63118 Problem Summary: Variant subcolumn predicates such as cast(v["int_key"] as bigint) IN (...) and cast(v["float_key"] as double) = 1.5 were not being pushed down to the inverted index. The FE plan wraps these as nested casts of the form CAST(CAST(slot(v) AS storage_dtype) AS user_target), but three BE code paths only accepted a single-level CAST(slot): - _filter_and_collect_cast_type_for_variant() returned early when the outer cast's child was not SLOT_REF, so it never recorded the user target type and the slot kept its original VARIANT value range. - is_valid_push_down_cast() required children[0]->children().at(0) to be a slot ref, so _is_predicate_acting_on_slot() rejected nested casts and skipped column-predicate construction. - _evaluate_inverted_index() in the common-expr-pushdown path also required cast_expr->get_child(0)->is_slot_ref(), so the second-pass index probe could not peel the cast either. This change peels the whole cast chain via VExpr::expr_without_cast() in all three places, while still using the outermost cast's target type for compatibility / round-trip checks. After the fix the column predicate path constructs widened ColumnValueRange / ColumnPredicate correctly, and convert_to_storage_value() normalizes each literal back to the storage type before probing. ### Release note None ### Check List (For Author) - Test: No need to test (covered by regression tests in apache#63118 once the test expectations are aligned in the follow-up commit) - Behavior changed: No - Does this need documentation: No Co-authored-by: Cursor <cursoragent@cursor.com>
…_without_cast ### What problem does this PR solve? Related PR: apache#63118 Problem Summary: The original regression-test/suites/inverted_index_p0/ test_variant_inverted_index_cast.groovy had three issues: - The IndexFilter section and HitRows counters appear in the profile as soon as the segment iterator initializes its inverted-index runtime, even when no predicate was actually pushed down. Asserting on profileText.contains("IndexFilter:") and on HitRows is therefore brittle. Switch the "index used" / "index not used" judgement to RowsInvertedIndexFiltered (rows the inverted index actually removed from the scan), which is exactly zero iff the index did not prune anything. - The cast(v["int_key"] as double) = 13.0 case is folded by Nereids into CAST(v AS int) = 13 and goes through the existing INT equality path, not the new BE widening logic. Drop that case (and the related OR variant) since the assertion no longer matches the PR semantics. - The cast(v["string_key"] as varchar(20)) case is wrapped by the FE as substring(CAST(CAST(v AS text) AS varchar(20)), 1, 20), which is outside the slot/cast-only contract this PR enables. Drop the case and document the limitation; the cast-to-text case continues to cover string-family widening. To balance the deletions, add a real negative case cast(v["int_key"] as bigint) = 5000000000 to verify that convert_to_storage_value() rejects out-of-range literals at probe time and falls back to full scan (RowsInvertedIndexFiltered == 0, ScanRows == 20) while still returning the correct empty result. Also add four lightweight unit tests in be/test/exprs/vexpr_test.cpp to pin down VExpr::expr_without_cast() behavior (no-cast / single-level / nested / non-slot leaf), since the variant pushdown paths fixed in the previous commit rely on it to find the underlying slot beneath a chain of FE-emitted casts. ### Release note None ### Check List (For Author) - Test: Regression test (this commit only adjusts test expectations and adds new test coverage) / Unit Test - Behavior changed: No - Does this need documentation: No Co-authored-by: Cursor <cursoragent@cursor.com>
…ed indexes ### What problem does this PR solve? Issue Number: None Related PR: apache#63118 Problem Summary: Inverted index predicate pushdown did not support safe widening casts on indexed VARIANT subcolumns, so type-compatible cast predicates could not use the inverted index and scanned more rows than necessary. This change allows storage-compatible cast predicates to be pushed down and converts compatible query literals to the segment storage type before building inverted-index query values, keeping the query encoding consistent with the indexed storage representation. ### Release note Support inverted index predicate pushdown for safe widening cast predicates on VARIANT subcolumns. ### Check List (For Author) - Test: Added regression and unit test coverage - Regression test: test_variant_inverted_index_cast covers positive and negative VARIANT inverted-index cast pushdown cases. - Unit Test: vexpr_test and inverted_index_reader_test cover cast peeling and storage-value conversion. - Behavior changed: Yes. Compatible VARIANT subcolumn cast predicates can now use inverted-index pushdown; unsupported or unsafe casts remain unpushed. - Does this need documentation: No
17f631e to
64bc776
Compare
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 31486 ms |
TPC-DS: Total hot run time: 169788 ms |
### What problem does this PR solve? Issue Number: None Related PR: apache#63118 Problem Summary: Remove redundant BE cast-chain peeling changes and broad regression coverage for cases already handled by existing FE rewrite rules. Keep this PR focused on the BE-side support still needed for safe widening casts that remain visible to inverted-index evaluation, such as auto-inferred BIGINT paths cast to LARGEINT and explicitly typed FLOAT paths compared through DOUBLE literals. ### Release note None ### Check List (For Author) - Test: - Static check: `git diff --cached --check`. - Manual test: previously compared master and this PR build on `10.106.128.180`. - Not run: full regression suite / BE UT after this cleanup. - Behavior changed: No. This removes redundant code and narrows tests to the actual supported behavior. - Does this need documentation: No
832a4a0 to
6f6fc5b
Compare
### What problem does this PR solve? Issue Number: None Related PR: apache#63118 Problem Summary: Fix a C++ compilation error where VCastExpr stored a shared_ptr IndexExecContext return value in an auto* variable. ### Release note None ### Check List (For Author) - Test: Not run (per request; user will run formatting and tests) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: None Related PR: apache#63118 Problem Summary: Add regression coverage for variant inverted-index cast pushdown when BIGINT storage values are outside the TINYINT range. ### Release note None ### Check List (For Author) - Test: Not run (per request) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: None Related PR: apache#63118 Problem Summary: Fix variant inverted-index cast pushdown so query literals are converted to the segment storage type before index probing, including positive IN predicates that narrow integer query types. This prevents column-level inverted-index evaluation from pruning valid rows for predicates such as cast(variant_path as tinyint) IN (...), and adds boundary coverage for out-of-range narrowing values. ### Release note None ### Check List (For Author) - Test: Manual test - Ran build-support/clang-format.sh be/src/storage/predicate/in_list_predicate.h - build-support/check-format.sh did not complete because local clang-format is not version 16 - BE unit test was not completed: run-be-ut.sh attempted submodule setup/download and was blocked by sandbox/network; elevated retry was interrupted - Regression test was not run because this worktree has no output/ Doris cluster - Behavior changed: No - Does this need documentation: No
4f9064f to
47b3565
Compare
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Fix the regression test expectation for tinyint overflow constants. These predicates are folded to an empty relation by FE, so no BE inverted-index profile counters are produced. ### Release note None ### Check List (For Author) - Test: Manual test - Verified from query profile that tinyint overflow predicates are planned as PhysicalEmptyRelation and do not produce inverted-index counters. - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Refine the variant inverted-index cast pushdown fix after review by clarifying integer cross-width conversion gates, documenting the round-trip invariant, reusing null-bitmap result construction, and extending conversion unit coverage. ### Release note None ### Check List (For Author) - Test: No need to test (not run per request) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Add missing coverage for inverted-index cast pushdown conversion boundaries, including double/float round-trip behavior, int-to-decimal conversion, out-of-range integer skip, string compatibility, negative predicates, and large bigint variant values. ### Release note None ### Check List (For Author) - Test: No need to test (not run per request) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Document why skipped literals are safe to ignore while building positive IN inverted-index results after storage-value conversion fails the round-trip check. ### Release note None ### Check List (For Author) - Test: No need to test (comment-only change) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Add non-matching bigint variant rows so the large-value inverted-index profile assertion has rows to filter and remains stable. ### Release note None ### Check List (For Author) - Test: No need to test (not run per request) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Keep the large bigint variant cast regression as a result-correctness case without requiring an unstable inverted-index profile counter for that key. ### Release note None ### Check List (For Author) - Test: No need to test (not run per request) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Move the large bigint variant key into the main inserted batch so it is an extracted indexed subcolumn, then restore the profile assertion that verifies inverted-index filtering is used. ### Release note None ### Check List (For Author) - Test: No need to test (not run locally) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Allow constant cast expressions on the literal side of comparison and IN predicates to continue through inverted-index evaluation. Large integer constants can be represented as cast literals, and the previous non-slot CAST_EXPR early return skipped index pushdown entirely. ### Release note None ### Check List (For Author) - Test: No need to test (not run locally) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Revert the constant-cast inverted-index pushdown change because environment comparison showed the large bigint regression uses the inverted index on both old master and the PR build when the test data is shaped correctly. The prior source change was based on inference rather than a reproduced source defect. ### Release note None ### Check List (For Author) - Test: Manual test - Compared old master and PR build on the provided environment with the same large bigint variant query; both reported RowsInvertedIndexFiltered=19 and InvertedIndexQueryTime > 0. - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close apache#63118 Related PR: apache#63118 Problem Summary: Keep the large bigint variant regression checking that inverted-index filtering is used, but do not require ScanRows to be exactly 1. Environment comparison showed the corrected query reports RowsInvertedIndexFiltered=19 and InvertedIndexQueryTime > 0 on both old master and the PR build, while ScanRows is 5. ### Release note None ### Check List (For Author) - Test: Manual test - Compared old master and PR build on the provided environment for the large bigint variant query; both used inverted-index filtering with RowsInvertedIndexFiltered=19. - Behavior changed: No - Does this need documentation: No
8eef016 to
f6ef2d9
Compare
f6ef2d9 to
9afb062
Compare
Summary
On current master, inverted index pushdown is not correctly applied for some casted VARIANT predicates.
This PR focuses on two production-critical patterns:
CAST(v["int_key"] AS INT) = <value>CAST(v["bool_key"] AS BOOLEAN)(equivalent to= truein filter context)In our production workloads, VARIANT is heavily used and teams are required to query subfields with explicit
CAST.When these predicates cannot use inverted index filtering, query latency and resource usage increase significantly.
Business Context
In our production workloads, each business table may contain a large number of dynamic JSON keys (often hundreds of keys over time).
Because of this, it is not feasible to predefine typed paths for every subkey.
Under this constraint, the common usage pattern across teams is:
vCAST(v["int_key"] AS INT))The issue appears exactly on this mainstream pattern: if casted subfield predicates cannot leverage inverted-index pushdown, scans degrade toward near full-row scanning on large datasets, which causes major latency and resource impact in production.
Reproduction
Case 1: INT cast equality
Case 2: BOOLEAN cast predicate
Expected Behavior
CAST(v["int_key"] AS INT) = 13, predicate should be pushed down and effectively filtered by inverted index.CAST(v["bool_key"] AS BOOLEAN), predicate should also be transformed/pushed down to an index-evaluable form.Actual Behavior (before this PR)
RowsInvertedIndexFiltered = 0, scanned rows remain high).Root Cause
CAST(v["int_key"] AS INT) = ..., the query literal type isINT, while auto-inferred storage for integer subkeys is typicallyBIGINT.CAST(v["bool_key"] AS BOOLEAN)in filter context, current master does not build an index-evaluable predicate path for this cast form, so pushdown is also missed.What this PR fixes
CAST(v["int_key"] AS INT) = <value>CAST(v["bool_key"] AS BOOLEAN)in filter context